The dashed green line marks the date of QI intervention.
We measure demographic parity using the proportion of each
demographic and age subpopulation that was given drug testing.
Goal: Achieve parity in all race and age
groups
We measure predictive parity as the proportion of each demographic’s
subpopulation that tested positive for THC or Non-THC drugs.
Goal: Achieve parity in all race groups and drug
detection types
We measure equalized odds using the proportion of each demographic
and age subpopulation that was given drug testing with a relevant order
indication such as “Substance use during pregnancy, excluding marijuana”
and “History of opioids prescribed during pregnancy”.
Goal: Achieve parity in all race and age
groups
We measure general group equity using the proportion of each
demographic and age subpopulation that received intervention for the
correct event.
Equation: (TP+FP)/(TP+FN)
Definition:
-
True positive is # of patients tested positive for non
THC substance and was reported to CPS
- False
positive is # of patients tested negative for non THC
substance and was reported to CPS
- True negative
is # of patients tested negative for non THC substance
and wasn’t reported to CPS
- False negative is #
of patients tested positive for non THC substance and
was reported to CPS
Goal: Achieve
as close to a ratio of 1 as possible so that a group is not under-served
(ratio < 1) or over-served (ratio > 1)
We measure equal outcomes for non tested mothers using the proportion
of each demographic subpopulation that were reported to CPS
without an UDS indicating non-THC drug use. There may
be other evidence not available in this dataset that would be relevant
to such a report, so this visual serves to illustrate the relationship
between non-testing and CPS report only.
Goal:
Achieve parity in all race groups
We measure equal outcomes for non tested mothers using the proportion
of each demographic subpopulation that were reported to CPS
with an UDS indicating non-THC drug use. There may be
other evidence not available in this dataset that would be relevant to
such a report, so this visual serves to illustrate the relationship
between testing and CPS report only.
Goal: Achieve
parity in all race groups
We measure demographic parity using the proportion of each
demographic and age subpopulation that was given drug testing. If no
intervention date was given, then the same table will show for both pre
and post intervention sections.
Goal: Achieve
parity in all race and age groups
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -1.9 | -2.5, -1.3 | <0.001 |
| White | -0.82 | -1.0, -0.61 | <0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | -0.19 | -0.43, 0.04 | 0.11 |
| 30 - 34 | -0.59 | -0.87, -0.31 | <0.001 |
| Over 34 | -0.60 | -0.88, -0.32 | <0.001 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -1.3 | -2.2, -0.51 | 0.003 |
| White | -0.48 | -0.81, -0.15 | 0.004 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | -0.62 | -1.1, -0.18 | 0.006 |
| 30 - 34 | -0.08 | -0.49, 0.33 | 0.7 |
| Over 34 | -0.43 | -0.90, 0.01 | 0.061 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
We measure predictive parity as the proportion of each demographic’s
subpopulation that tested positive for THC or Non-THC drugs. If no
intervention date was given, then the same table will show for both pre
and post intervention sections.
Goal: Achieve
parity in all race groups and drug detection types
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -1.6 | -4.5, 0.05 | 0.12 |
| White | 0.64 | 0.25, 1.0 | 0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | 0.64 | 0.17, 1.1 | 0.007 |
| 30 - 34 | 0.87 | 0.33, 1.4 | 0.002 |
| Over 34 | 0.86 | 0.31, 1.4 | 0.002 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -1.6 | -4.5, 0.05 | 0.12 |
| White | 0.64 | 0.25, 1.0 | 0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | 0.64 | 0.17, 1.1 | 0.007 |
| 30 - 34 | 0.87 | 0.33, 1.4 | 0.002 |
| Over 34 | 0.86 | 0.31, 1.4 | 0.002 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -0.74 | -2.3, 0.50 | 0.3 |
| White | -0.99 | -1.4, -0.55 | <0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | 0.19 | -0.25, 0.64 | 0.4 |
| 30 - 34 | -0.16 | -0.71, 0.38 | 0.6 |
| Over 34 | -0.40 | -0.99, 0.17 | 0.2 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -0.74 | -2.3, 0.50 | 0.3 |
| White | -0.99 | -1.4, -0.55 | <0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | 0.19 | -0.25, 0.64 | 0.4 |
| 30 - 34 | -0.16 | -0.71, 0.38 | 0.6 |
| Over 34 | -0.40 | -0.99, 0.17 | 0.2 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
We measure equal outcomes for non tested mothers using the proportion
of each demographic subpopulation that were reported to CPS
without an UDS indicating non-THC drug use. There may
be other evidence not available in this dataset that would be relevant
to such a report, so this visual serves to illustrate the relationship
between non-testing and CPS report only. If no intervention date was
given, then the same table will show for both pre and post intervention
sections.
Goal: Achieve parity in all race
groups
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -1.4 | -2.8, -0.37 | 0.021 |
| White | -1.3 | -1.9, -0.72 | <0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | -0.07 | -0.67, 0.53 | 0.8 |
| 30 - 34 | -0.28 | -0.98, 0.38 | 0.4 |
| Over 34 | -0.52 | -1.3, 0.19 | 0.2 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -1.4 | -2.8, -0.37 | 0.021 |
| White | -1.3 | -1.9, -0.72 | <0.001 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | -0.07 | -0.67, 0.53 | 0.8 |
| 30 - 34 | -0.28 | -0.98, 0.38 | 0.4 |
| Over 34 | -0.52 | -1.3, 0.19 | 0.2 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -0.03 | -1.4, 1.2 | >0.9 |
| White | 0.16 | -0.23, 0.56 | 0.4 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | 0.29 | -0.16, 0.74 | 0.2 |
| 30 - 34 | 0.38 | -0.15, 0.91 | 0.2 |
| Over 34 | 0.39 | -0.15, 0.92 | 0.2 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
| Characteristic | log(OR)1 | 95% CI1 | p-value |
|---|---|---|---|
| Race group | |||
| Black or African American | 0.00 | — | |
| Other | -0.03 | -1.4, 1.2 | >0.9 |
| White | 0.16 | -0.23, 0.56 | 0.4 |
| Age group | |||
| Under 25 | 0.00 | — | |
| 25 - 30 | 0.29 | -0.16, 0.74 | 0.2 |
| 30 - 34 | 0.38 | -0.15, 0.91 | 0.2 |
| Over 34 | 0.39 | -0.15, 0.92 | 0.2 |
| 1 OR = Odds Ratio, CI = Confidence Interval | |||
---
title: "Fairlabs Dashboard"
header-includes:
- \usepackage{comment}
output:
flexdashboard::flex_dashboard:
theme: cosmo
css: style.css
orientation: rows
vertical_layout: scroll
source_code: embed
---
```{r, include=FALSE}
library(flexdashboard)
library(tidyverse)
# library(lubridate)
library(plotly)
library(knitr)
library(janitor)
library(DT)
library(reshape2)
require(scales)
library(ggplot2)
library(gtsummary)
library(htmltools)
```
```{r}
#TODO:
# data definitions to external file? json or yaml well it's tabs now
# function for transforming/cleaning data based on definitions done
# function for calculations of summaries and stats based on grouping variables? I'm not going to do that
```
```{r}
# cleans raw data based on user defined input columns
clean_data <- function(df, input){
# Pull user defined columns
# dates
delivery_date_in <- input[input$column_out=='delivery_date','column_in']
cps_reporting_date_in <- input[input$column_out=='cps_reporting_date','column_in']
uds_collection_date_in <- input[input$column_out=='uds_collection_date','column_in']
uds_test_in <- input[input$column_out=='uds_test','column_in']
maternal_birth_date_in <- input[input$column_out=='maternal_birth_date','column_in']
# static QI intervention date from dictionary
intervention_date_in <- as.Date(input[input$column_out=='intervention_date','column_in'])
# filled if non-THC and THC are stored as individual drug columns
non_thc_col <- strsplit(input[input$column_out=='non_thc_cols','column_in'], split ='\\\\t')[[1]]
thc_col_in <- input[input$column_out=='thc_col','column_in']
# filled if non-THC and THC are stored as true/false
non_thc_detect_in <- input[input$column_out=='non_thc_detect','column_in']
thc_detect_in <- input[input$column_out=='thc_detect','column_in']
maternal_age_in <- input[input$column_out=='maternal_age','column_in']
maternal_race_in <- input[input$column_out=='maternal_race','column_in']
order_indication_in <- input[input$column_out=='order_indication','column_in']
# Used to determine which order indication is for non-THC drug use
ord_indict_non_thc_in <- strsplit(input[input$column_out=='ord_indict_non_thc','column_in'], split ='\\\\t')[[1]]
# Checks if each column exist, and format for summary and visual uses
if (delivery_date_in == "") {
df$delivery_date <- NA
df$delivery_month <- NA
} else {
df$delivery_date <- as.Date(df[[delivery_date_in]])
df$delivery_month <- as.Date(format.Date(df$delivery_date, '%Y-%m-01'))
}
if (cps_reporting_date_in == "") {
df$cps_reporting_date <- NA
df$cps_report <- NA
} else {
df$cps_reporting_date <- as.Date(df[[cps_reporting_date_in]])
df$cps_report <- if_else(is.na(df$cps_reporting_date), FALSE, TRUE)
}
if (is.na(intervention_date_in)) {
df$pre_post_QI <- NA
} else {
df <- df %>%
mutate(pre_post_QI = factor(if_else(delivery_date >= intervention_date_in, 'Post', 'Pre'),
levels = c('Pre', 'Post')))
}
if ((uds_collection_date_in == "" & uds_test_in != "")) {
df$uds_collection_date <- NA
df$uds_test <- df[[uds_test_in]]
} else if (uds_collection_date_in != "") {
df$uds_collection_date <- as.Date(df[[uds_collection_date_in]])
df$uds_test <- if_else(is.na(df$uds_collection_date), FALSE, TRUE)
} else {
df$uds_collection_date <- NA
df$uds_test <- NA
}
if (thc_detect_in != "" & thc_col_in == "") {
df$thc_detect <- df[[thc_detect_in]]
} else if (thc_col_in != "") {
df$detected_tetrahydrocannabinol <- df[[thc_col_in]]
df$thc_detect <- if_else(df$detected_tetrahydrocannabinol==1, TRUE, FALSE)
} else {
df$thc_detect <- NA
}
if (non_thc_detect_in != "" & is_empty(non_thc_col)) {
df$non_thc_detect <- df[[non_thc_detect_in]]
} else if (!is_empty(non_thc_col)) {
df <- df %>%
mutate(non_thc_detect = if_else(rowSums(df[non_thc_col]) > 0, TRUE, FALSE))
} else {
df$non_thc_detect <- NA
}
if (maternal_age_in != "") {
df$maternal_age <- df[[maternal_age_in]]
df <- df %>%
mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25',
maternal_age < 30 ~ '25 - 30',
maternal_age < 34 ~ '30 - 34',
TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
} else if (maternal_birth_date_in != "") {
df$maternal_birth_date = as.Date(df[[maternal_birth_date_in]])
df$maternal_age <- floor(as.numeric(difftime(delivery_date, maternal_birth_date, units = 'days'))/365.25)
df <- df %>%
mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25',
maternal_age < 30 ~ '25 - 30',
maternal_age < 34 ~ '30 - 34',
TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
} else {
df$maternal_age <- NA
df$age_group <- NA
}
if (maternal_race_in != "") {
df$maternal_race <- df[[maternal_race_in]]
df <- df %>%
mutate(race_group = case_when(maternal_race == 'Black or African American' ~ maternal_race,
maternal_race == 'White' ~ maternal_race,
TRUE ~ 'Other'))
} else {
df$maternal_race <- NA
df$race_group <- NA
}
if (order_indication_in == "") {
df$ord_indict_non_thc <- NA
} else {
df$order_indication <- df[[order_indication_in]]
df$ord_indict_non_thc <- if_else(df$order_indication %in% ord_indict_non_thc_in, TRUE, FALSE)
}
df
}
```
```{r}
data <- read.csv("/mnt/home/jzhang92/fairlabs/fairlabs_data.csv")
# data <- read.csv("/mnt/home/jzhang92/fairlabs/fairlabs_local_data_j.csv")
input <- read.csv("/mnt/home/jzhang92/fairlabs/fairlab_input_dict.txt", sep = '\t')
# input <- read.csv("/mnt/home/jzhang92/fairlabs/fairlab_input_dict_l.txt", sep = '\t')
race_pal <- setNames(object = c("#66c2a5","#fc8d62","#8da0cb"), nm = c('Black or African American', 'White', 'Other'))
age_pal <- setNames(object = c("#f0f9e8","#bae4bc","#7bccc4", '#2b8cbe'), nm = c('Under 25', '25 - 30', '30 - 34', 'Over 34'))
intervention_date <- as.Date(input[input$column_out=='intervention_date','column_in'])
data <- clean_data(data, input)
```
```{r}
# data summary
## demographic parity
demo_sum <- data %>%
group_by(race_group,age_group,pre_post_QI,delivery_month,uds_test) %>%
count() %>%
ungroup() %>%
group_by(race_group,age_group,pre_post_QI,delivery_month) %>%
mutate(total = sum(n),
perc_total = round(n/total, digits = 2)) %>%
ungroup()
## predictive parity
pred_non_thc_sum <- data %>%
filter(uds_test) %>%
group_by(race_group,pre_post_QI,delivery_month,non_thc_detect) %>%
count() %>%
ungroup() %>%
group_by(race_group,pre_post_QI,delivery_month) %>%
mutate(total = sum(n),
perc_total = round(n/total, digits = 2)) %>%
ungroup()
pred_thc_sum <- data %>%
filter(uds_test) %>%
group_by(race_group,pre_post_QI,delivery_month,thc_detect) %>%
count() %>%
ungroup() %>%
group_by(race_group,pre_post_QI,delivery_month) %>%
mutate(total = sum(n),
perc_total = round(n/total, digits = 2)) %>%
ungroup()
## equalized odds depends on order indication data
if (sum(is.na(data$ord_indict_non_thc))==nrow(data)) {
equal_odds_wide <- NA
error_ord <- "The data set is missing order indication data. The dashboard cannot create this visual or table."
} else {
error_ord <- NA
## equalized odds
equal_odds <- data %>%
filter(uds_test) %>%
group_by(race_group,age_group,pre_post_QI,ord_indict_non_thc,non_thc_detect) %>%
count() %>%
ungroup() %>%
mutate(ord_ind_detect = factor(case_when(ord_indict_non_thc & non_thc_detect ~ 'TP',
ord_indict_non_thc & !non_thc_detect ~ 'FP',
!ord_indict_non_thc & !non_thc_detect ~ 'TN',
!ord_indict_non_thc & non_thc_detect ~ 'FN')))
equal_odds_wide <- pivot_wider(equal_odds%>%
select(c("race_group","age_group","pre_post_QI",'ord_ind_detect', 'n')), names_from = ord_ind_detect, values_from = n) %>%
mutate(FPR = (FP/(FP+TN)),
TPR = (TP/(TP+FN)),
ratio = FPR/TPR)
}
## equal outcome and group benefit depends on cps report data
if (sum(is.na(data$cps_report))==nrow(data)) {
equal_out <- NA
group_benefit_wide <- NA
error_cps <- "The data set is missing CPS report data. The dashboard cannot create this visual or table."
} else {
error_cps <- NA
## equal outcomes
equal_out <- data %>%
# filter(!is.na(cps_reporting_date)) %>%
group_by(uds_test, race_group, non_thc_detect,cps_report, pre_post_QI,delivery_month) %>%
count() %>%
ungroup() %>%
group_by(uds_test, race_group,pre_post_QI,delivery_month) %>%
mutate(total = sum(n),
perc_total = round(n/total, digits = 2)) %>%
ungroup()
## group benefit equality
group_benefit <- data %>%
filter(uds_test) %>%
group_by(race_group,age_group,pre_post_QI,cps_report,non_thc_detect) %>%
count() %>%
ungroup() %>%
mutate(cps_detect = factor(case_when(cps_report & non_thc_detect ~ 'TP',
cps_report & !non_thc_detect ~ 'FP',
!cps_report & !non_thc_detect ~ 'TN',
!cps_report & non_thc_detect ~ 'FN')))
# mutate(cps_detect = factor(case_when(cps_report & non_thc_detect ~ 'TP',
# !cps_report & non_thc_detect ~ 'FP',
# !cps_report & !non_thc_detect ~ 'TN',
# cps_report & !non_thc_detect ~ 'FN')))
group_benefit_wide <- pivot_wider(group_benefit %>%
select(c("race_group","age_group","pre_post_QI",'cps_detect', 'n')), names_from = cps_detect, values_from = n) %>%
mutate(ratio = ((TP+FP)/(TP+FN)))
}
```
```{r, child = 'visuals.rmd'}
```
```{r, child = 'section.rmd'}
```